
    AdaBiM: An adaptive proximal gradient method for structured convex bilevel optimization

    Bilevel optimization is a comprehensive framework that bridges single- and multi-objective optimization. It encompasses many general formulations, including, but not limited to, standard nonlinear programs. This work demonstrates how elementary proximal gradient iterations can be used to solve a wide class of convex bilevel optimization problems without involving subroutines. Compared to and improving upon existing methods, ours (1) can handle a wider class of problems, including nonsmooth terms in the upper and lower level problems, (2) does not require strong convexity or global Lipschitz gradient continuity assumptions, and (3) provides a systematic adaptive stepsize selection strategy, allowing for the use of large stepsizes while remaining insensitive to the choice of parameters.
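    As context, here is a minimal sketch of the elementary proximal gradient iteration the abstract refers to, applied to a plain single-level lasso problem. The problem instance, the fixed stepsize 1/L, and all names below are illustrative assumptions; AdaBiM's bilevel structure and adaptive stepsize rule are not reproduced.

    ```python
    import numpy as np

    # Elementary proximal gradient (ISTA) on a single-level lasso problem:
    #   minimize 0.5*||Ax - b||^2 + lam*||x||_1
    # A fixed stepsize 1/L is used purely for illustration.

    def soft_threshold(x, t):
        # Proximal operator of t*||.||_1.
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def prox_grad(A, b, lam, iters=500):
        L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
        step = 1.0 / L
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            grad = A.T @ (A @ x - b)       # forward (gradient) step
            x = soft_threshold(x - step * grad, step * lam)  # backward (prox) step
        return x

    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
    print(prox_grad(A, b, lam=0.5))
    ```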

    Bregman Finito/MISO for nonconvex regularized finite sum minimization without Lipschitz gradient continuity

    We introduce two algorithms for nonconvex regularized finite sum minimization, where the typical Lipschitz differentiability assumptions are relaxed to the notion of relative smoothness. The first is a Bregman extension of Finito/MISO, studied for fully nonconvex problems when the sampling is random, or under convexity of the nonsmooth term when the sampling is essentially cyclic. The second algorithm is a low-memory variant, in the spirit of SVRG and SARAH, that also allows for fully nonconvex formulations. Our analysis is made remarkably simple by employing a Bregman Moreau envelope as Lyapunov function. In the randomized case, linear convergence is established when the cost function is strongly convex, yet with no convexity requirements on the individual functions in the sum. For the essentially cyclic and low-memory variants, global and linear convergence results are established when the cost function satisfies the Kurdyka-Łojasiewicz property.
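    The following is a minimal Euclidean sketch of the Finito/MISO update pattern the abstract builds on: a table of per-function points and gradients, an aggregated step, and a randomized refresh of one table entry. The least-squares instance is an illustrative assumption, the nonsmooth regularizer is omitted, and the paper's Bregman extension (which replaces the Euclidean geometry below with a Bregman distance) is not reproduced.

    ```python
    import numpy as np

    # Finito/MISO-style incremental step for  minimize (1/N) sum_i f_i(x)
    # with f_i(x) = 0.5*(a_i @ x - b_i)^2, randomized sampling.

    def finito(a, b, gamma, epochs=50, seed=0):
        rng = np.random.default_rng(seed)
        N, n = a.shape
        z = np.zeros((N, n))                  # one memorized point per function
        g = np.array([ai * (ai @ zi - bi)     # memorized gradients grad f_i(z_i)
                      for ai, bi, zi in zip(a, b, z)])
        for _ in range(epochs * N):
            x = z.mean(axis=0) - gamma * g.mean(axis=0)  # aggregated update
            j = rng.integers(N)               # uniformly sampled index
            z[j] = x                          # refresh the j-th table entry
            g[j] = a[j] * (a[j] @ x - b[j])
        return z.mean(axis=0) - gamma * g.mean(axis=0)

    rng = np.random.default_rng(1)
    a, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
    print(finito(a, b, gamma=0.05))
    ```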

    Adaptive proximal algorithms for convex optimization under local Lipschitz continuity of the gradient

    Backtracking linesearch is the de facto approach for minimizing continuously differentiable functions with locally Lipschitz gradient. In recent years, it has been shown that in the convex setting it is possible to avoid linesearch altogether and let the stepsize adapt based on a local smoothness estimate, without any backtracks or evaluations of the function value. In this work we propose an adaptive proximal gradient method, adaPG, that uses novel estimates of the local smoothness modulus which lead to less conservative stepsize updates, and that can additionally cope with nonsmooth terms. This idea is extended to the primal-dual setting, where an adaptive three-term primal-dual algorithm, adaPD, is proposed that can be viewed as an extension of the PDHG method. Moreover, in this setting the "essentially" fully adaptive variant adaPD+ is proposed, which avoids evaluating the linear operator norm by invoking a backtracking procedure that, remarkably, does not require extra gradient evaluations. Numerical simulations demonstrate the effectiveness of the proposed algorithms compared to the state of the art.
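    Below is a hedged sketch of a linesearch-free adaptive proximal gradient loop of the kind the abstract describes: the stepsize adapts to the local estimate L_k = ||grad f(x_k) - grad f(x_{k-1})|| / ||x_k - x_{k-1}|| with no backtracks and no function evaluations. The update rule shown follows the earlier Malitsky-Mishchenko scheme as a stand-in; adaPG's own, less conservative estimates differ in their details, and the lasso instance is an illustrative assumption.

    ```python
    import numpy as np

    # Adaptive (linesearch-free) proximal gradient on a lasso problem.
    # Stepsize rule: gamma_{k+1} = min( sqrt(1 + theta_k) * gamma_k, 1/(2 L_k) ),
    # where L_k is a local smoothness estimate and theta_k = gamma_k / gamma_{k-1}.

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def adaptive_prox_grad(A, b, lam, iters=300, gamma0=1e-6):
        x = np.zeros(A.shape[1])
        grad = A.T @ (A @ x - b)
        gamma, theta = gamma0, 1e9            # tiny initial stepsize, large theta
        for _ in range(iters):
            x_new = soft_threshold(x - gamma * grad, gamma * lam)
            grad_new = A.T @ (A @ x_new - b)
            dx, dg = x_new - x, grad_new - grad
            L = np.linalg.norm(dg) / max(np.linalg.norm(dx), 1e-16)  # local estimate
            bound = 1.0 / (2.0 * L) if L > 0 else np.inf
            gamma_new = min(np.sqrt(1.0 + theta) * gamma, bound)
            theta = gamma_new / gamma
            x, grad, gamma = x_new, grad_new, gamma_new
        return x

    rng = np.random.default_rng(2)
    A, b = rng.standard_normal((40, 15)), rng.standard_normal(40)
    print(adaptive_prox_grad(A, b, lam=0.3))
    ```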

    Distributed proximal algorithms for large-scale structured optimization

    Efficient first-order algorithms for large-scale distributed optimization are the main subject of investigation in this thesis. The algorithms considered cover a wide array of applications in machine learning, signal processing and control. In recent years, a large number of algorithms have been introduced that rely on (possibly a reformulation of) one of the classical splitting algorithms, specifically forward-backward, Douglas-Rachford and forward-backward-forward splittings. In this thesis a new three-term splitting technique is developed that recovers forward-backward and Douglas-Rachford splittings as special cases. In the context of structured optimization, this splitting is leveraged to develop a framework for a large class of primal-dual algorithms, providing a unified convergence analysis for many seemingly unrelated methods. Moreover, linear convergence is established for all such algorithms under mild regularity conditions on the cost functions. As another notable contribution, we propose a randomized block-coordinate primal-dual algorithm that leads to a fully distributed asynchronous algorithm in a multi-agent model. Moreover, when specializing to multi-agent structured optimization over graphs, novel algorithms are proposed. In addition, it is shown that in a multi-agent model bounded communication delays are tolerated by primal-dual algorithms provided that certain strong convexity assumptions hold. In the final chapter we depart from convex analysis and consider a fully nonconvex block-coordinate proximal gradient algorithm, showing that it leads to nonconvex incremental aggregated algorithms for regularized finite sum and sharing problems with very general sampling strategies.
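    For context, here is the classical Douglas-Rachford iteration that the thesis's new three-term splitting recovers as a special case, applied to a lasso problem. The instance and parameters are illustrative assumptions; the new splitting itself and its primal-dual framework are not reproduced.

    ```python
    import numpy as np

    # Douglas-Rachford splitting for  minimize 0.5*||Ax - b||^2 + lam*||x||_1.

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def douglas_rachford(A, b, lam, gamma=1.0, iters=200):
        n = A.shape[1]
        M = np.eye(n) + gamma * A.T @ A    # system matrix for the quadratic prox
        Atb = gamma * A.T @ b
        z = np.zeros(n)
        for _ in range(iters):
            x = np.linalg.solve(M, z + Atb)              # x = prox_{gamma f}(z)
            y = soft_threshold(2 * x - z, gamma * lam)   # y = prox_{gamma g}(2x - z)
            z = z + y - x                                # reflected update
        return x

    rng = np.random.default_rng(3)
    A, b = rng.standard_normal((25, 8)), rng.standard_normal(25)
    print(douglas_rachford(A, b, lam=0.4))
    ```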

    Asymmetric forward-backward-adjoint splitting for solving monotone inclusions involving three operators

    In this work we propose a new splitting technique, namely Asymmetric Forward–Backward–Adjoint splitting, for solving monotone inclusions involving three operators: a maximally monotone operator, a cocoercive operator, and a bounded linear operator. Our scheme cannot be recovered from existing operator splitting methods, while classical methods like Douglas–Rachford and Forward–Backward splitting are special cases of the new algorithm. Asymmetric preconditioning is the main feature of Asymmetric Forward–Backward–Adjoint splitting; it allows us to unify, extend and shed light on the connections between many seemingly unrelated primal-dual algorithms proposed in recent years for solving structured convex optimization problems. One important special case leads to a Douglas–Rachford type scheme that includes a third cocoercive operator.
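    As an illustration of one of the primal-dual algorithms such splittings unify, here is a sketch of the Chambolle-Pock (PDHG) method on the toy problem minimize_x ||Kx - b||_1 + 0.5*||x||^2. The problem instance is an illustrative assumption; AFBA's asymmetric preconditioning, which recovers this and related schemes, is not reproduced.

    ```python
    import numpy as np

    # PDHG for  minimize_x  f(K x) + g(x)  with f = ||. - b||_1, g = 0.5*||.||^2.
    # Dual prox: prox_{sigma f*}(v) = clip(v - sigma*b, -1, 1).

    def pdhg(K, b, iters=500):
        m, n = K.shape
        tau = sigma = 0.9 / np.linalg.norm(K, 2)   # tau * sigma * ||K||^2 < 1
        x, y = np.zeros(n), np.zeros(m)
        for _ in range(iters):
            x_new = (x - tau * K.T @ y) / (1.0 + tau)  # prox of 0.5*||.||^2
            v = y + sigma * K @ (2.0 * x_new - x)      # extrapolated dual step
            y = np.clip(v - sigma * b, -1.0, 1.0)      # prox of sigma * f*
            x = x_new
        return x

    rng = np.random.default_rng(4)
    K, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
    print(pdhg(K, b))
    ```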

    A New Randomized Block-Coordinate Primal-Dual Proximal Algorithm for Distributed Optimization

    This paper proposes TriPD, a new primal-dual algorithm for minimizing the sum of a Lipschitz-differentiable convex function and two possibly nonsmooth convex functions, one of which is composed with a linear mapping. We devise a randomized block-coordinate version of the algorithm which converges under the same stepsize conditions as the full algorithm. It is shown that both the original and the block-coordinate scheme feature a linear convergence rate when the functions involved are either piecewise linear-quadratic or satisfy a certain quadratic growth condition (which is weaker than strong convexity). Moreover, we apply the developed algorithms to the problem of multi-agent optimization on a graph, obtaining novel synchronous and asynchronous distributed methods. The proposed algorithms are fully distributed in the sense that the updates and the stepsizes of each agent depend only on local information; in fact, no prior global coordination is required. Finally, we showcase an application of our algorithm in distributed formation control.
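    The following is a hedged sketch of the randomized block-coordinate pattern the abstract describes, reduced to a plain proximal gradient setting: at each iteration one randomly sampled coordinate takes a proximal step while the others stay fixed. The lasso instance is an illustrative assumption; TriPD's actual primal-dual updates, dual variables, and stepsize conditions are not reproduced.

    ```python
    import numpy as np

    # Randomized (block-)coordinate proximal gradient for
    #   minimize 0.5*||Ax - b||^2 + lam*||x||_1, blocks of size one.

    def block_coordinate_prox_grad(A, b, lam, epochs=100, seed=0):
        rng = np.random.default_rng(seed)
        n = A.shape[1]
        steps = 1.0 / (A ** 2).sum(axis=0)   # per-coordinate stepsizes 1/L_i
        x, r = np.zeros(n), -b               # maintain residual r = Ax - b
        for _ in range(epochs * n):
            i = rng.integers(n)              # uniformly sampled coordinate
            g = A[:, i] @ r                  # partial gradient w.r.t. x_i
            v = x[i] - steps[i] * g
            xi = np.sign(v) * max(abs(v) - steps[i] * lam, 0.0)  # scalar prox
            r += A[:, i] * (xi - x[i])       # incremental residual update
            x[i] = xi
        return x

    rng = np.random.default_rng(5)
    A, b = rng.standard_normal((30, 12)), rng.standard_normal(30)
    print(block_coordinate_prox_grad(A, b, lam=0.3))
    ```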